NLP Challenges for Machine Translation from English to Indian Languages
Abstract
This natural language processing work focuses on English-Kannada/Telugu translation. Kannada is a language of India, classified as Dravidian, Southern, Tamil-Kannada, Kannada. Regions spoken: Kannada is spoken in Karnataka, Andhra Pradesh, Tamil Nadu, and Maharashtra. Population: Kannada had 35,346,000 speakers as of 1997. Alternate names: Kanarese, Canarese, Banglori, and Madrassi. Dialects: there are about 20 dialects, including Bijapur, Jeinu Kuruba, and Aine Kuruba; Badaga may be one. Kannada is the state language of Karnataka. About 9,000,000 people speak Kannada as a second language. The literacy rate is about 60% among both first-language and second-language speakers of Kannada (in India). Kannada was used in the Bible from 1831 to 2000. Statistical machine translation (SMT) is a machine translation paradigm in which translations are generated on the basis of statistical models whose parameters are derived from the analysis of bilingual text corpora. The statistical approach contrasts with rule-based approaches to machine translation as well as with example-based machine translation.
Keywords: Multilingual Cross Language Information Retrieval (MCLIR), Morphology, Natural Language Processing (NLP), Statistical machine translation (SMT), Word-Sense Disambiguation (WSD)
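To make concrete the idea that SMT parameters are "derived from the analysis of bilingual text corpora", the following is a minimal sketch of one classic way to estimate them: expectation-maximization for word-translation probabilities in the style of IBM Model 1. It is not taken from the paper. The function names (`train_ibm_model1`, `best_translation`) and the three-sentence toy corpus are hypothetical, the romanized Kannada tokens are illustrative rather than vetted translations, and the NULL source word used in the full model is omitted for brevity.

```python
from collections import defaultdict

# Hypothetical toy parallel corpus: (English tokens, romanized Kannada tokens).
# The token pairs are illustrative only, not vetted translations.
toy_corpus = [
    (["a", "house"], ["ondu", "mane"]),
    (["a", "book"], ["ondu", "pustaka"]),
    (["house"], ["mane"]),
]


def train_ibm_model1(corpus, iterations=20):
    """Estimate t(f | e), the probability that source word e translates
    to target word f, with EM (IBM Model 1 style, no NULL word)."""
    src_vocab = {e for src, _ in corpus for e in src}
    tgt_vocab = {f for _, tgt in corpus for f in tgt}

    # Start from a uniform distribution over the target vocabulary.
    t = {(f, e): 1.0 / len(tgt_vocab) for e in src_vocab for f in tgt_vocab}

    for _ in range(iterations):
        count = defaultdict(float)  # expected count of (f, e) alignments
        total = defaultdict(float)  # expected count of e being aligned

        # E-step: spread each target word over the source words of its
        # sentence, in proportion to the current t(f | e).
        for src, tgt in corpus:
            for f in tgt:
                norm = sum(t[(f, e)] for e in src)
                for e in src:
                    frac = t[(f, e)] / norm
                    count[(f, e)] += frac
                    total[e] += frac

        # M-step: renormalize expected counts into probabilities.
        for (f, e) in t:
            t[(f, e)] = count[(f, e)] / total[e]

    return t


def best_translation(t, e):
    """Return the target word with the highest t(f | e) for source word e."""
    candidates = {f: p for (f, src), p in t.items() if src == e}
    return max(candidates, key=candidates.get)


if __name__ == "__main__":
    table = train_ibm_model1(toy_corpus)
    for word in ["a", "house", "book"]:
        print(word, "->", best_translation(table, word))
```

A real SMT system would combine such a translation table with a target language model and a decoder, and morphologically rich languages such as Kannada and Telugu would typically need morphological segmentation before word alignment.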
Similar resources
Statistical Machine Translation for Indian Languages: Mission Hindi 2
This paper presents Centre for Development of Advanced Computing Mumbai’s (CDACM) submission to NLP Tools Contest on Statistical Machine Translation in Indian Languages (ILSMT) 2015 (collocated with ICON 2015). The aim of the contest was to collectively explore the effectiveness of Statistical Machine Translation (SMT) while translating within Indian languages and between English and Indian lan...
Śata-Anuvādak: Tackling Multiway Translation of Indian Languages
We present a compendium of 110 Statistical Machine Translation systems built from parallel corpora of 11 Indian languages belonging to the Indo-Aryan and Dravidian families. We analyze the relationship between translation accuracy and the language families involved. We feel that insights obtained from this analysis will provide guidelines for creating machine translation systems for specific In...
Machine Translation System in Indian Perspectives
Problem statement: In a large multilingual society like India, there is a great demand for translation of documents from one language to another. Approach: Most state governments work in their provincial languages, whereas the central government’s official documents and reports are in English and Hindi. Results: In order to have an appropriate communication there is a need to tr...
Machine Translation of Indian Signs for Endocrinologist
India is the second most populous country in the world, with over a billion people and over a million hearing-impaired and diabetic patients. A translation system that can translate a given input into sign language can be used to disseminate information to these hearing-impaired patients. Such people find it difficult to access information in common places like hospitals ...
An Empirical Survey on Automatic Machine Translation between English and Indian Languages
In this paper, we report our survey of systems and projects that aim to translate between English and Indian languages. Most of the translators and projects aim to translate from English to more than one Indian language. The main challenge is that Indian languages are quite different from European languages. In this paper, we have explored the following ...